  set more off
  mat drop _all

*===================================================================*
*   BIHAR EVALUATION OF SOCIAL FRANCHISING AND TELEMEDICINE (BEST)
*   This do-file defines the analysis sample for each wave
*====================================================================*

* To define the main sample, we keep only providers with complete data
* in both of the following instruments:
* a. Providers' interview
* b. Vignette

*====================================================================*
* Define Main Sample for Baseline 
*====================================================================*

* Open Providers' survey
  use "$prodata1\providers_interview_1st", clear

* Keep main variables
  keep prov_id wave cluster treat sample1
  count

* Merge with vignette
/*NOTE: the result is the same with the diarrhea or the pneumonia vignette because each provider evaluates both cases*/
  merge 1:1 prov_id wave using "$prodata1\vignette_diarrhea", keepus(sample4)
* Keep providers present in both the interview and the vignette data
  keep if _merge==3
  tab sample1 sample4
  drop sample4
  drop _merge
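
* Optional check (not part of the original do-file): a minimal sketch that
* reports how many providers remain after the merge restriction. The
* assert encodes an assumption that the restricted sample is non-empty.
  count
  assert r(N) > 0   // assumption: at least one provider is retained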

* Save (temporary file)
  drop sample1
  tempfile baseline_sample
  save "`baseline_sample'"

*====================================================================*
* Define Main Sample for Follow-up
*====================================================================*

* Open Providers' survey
  use "$prodata2\providers_interview_2nd", clear

* Keep main variables
  keep prov_id wave cluster treat sample1 
  count

* Merge with vignette
/*NOTE: the result is the same with the diarrhea or the pneumonia vignette because each provider evaluates both cases*/
  merge 1:1 prov_id wave using "$prodata2\vignette_diarrhea", keepus(sample4)
  tab sample1 sample4
* Keep providers flagged in both the interview (sample1) and the vignette (sample4)
  keep if sample1==1 & sample4==1
  drop sample4
  drop _merge

* Append the baseline sample and save
  drop sample1
  append using "`baseline_sample'"
  order cluster treat, after(wave)
  sort wave cluster prov_id
  save "$prodata\sample_by_wave", replace
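
* Optional check (not part of the original do-file): a minimal sketch,
* assuming prov_id and wave jointly identify observations once both waves
* are stacked (i.e., wave takes a different value in each source file).
* isid stops with an error if that assumption does not hold.
  isid prov_id wave
* Tabulate the sample by wave and treatment arm, showing missing values
  tab wave treat, missing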
